optimize cellsize for lon-lat projections #768

Open
wants to merge 24 commits into
base: main

tiemvanderdeure
Contributor

Solves #764 and makes cellsize calculation much faster for lon-lat projections.

I'm considering getting rid of the ArchGDAL requirement for these projections as well.

Any ideas for further improvements @rafaqz @alex-s-gardner?

@rafaqz
Owner

rafaqz commented Sep 27, 2024

How do we get rid of ArchGDAL?

@tiemvanderdeure
Contributor Author

I was thinking we could just make a _crs2transform that just calls ArchGDAL.crs2transform if ArchGDAL is loaded and otherwise tells the user to load ArchGDAL. If the CRS is WGS84 then cellsize would never have to call _crs2transform
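
A minimal sketch of that fallback pattern, assuming a hypothetical error type and a stand-in CRS type. In the real package the specialised method would live in the ArchGDAL extension, and none of the names below other than `_crs2transform` come from this thread:

```julia
# Hypothetical sketch of an extension stub: the core package only defines a
# fallback method that explains what to load. All names are illustrative.
struct BackendNotLoadedError <: Exception
    backend::String
end

Base.showerror(io::IO, e::BackendNotLoadedError) =
    print(io, "this operation needs `", e.backend, "`; run `using ", e.backend, "` first")

# Fallback, hit when no extension has added a more specific method.
_crs2transform(source_crs, target_crs) = throw(BackendNotLoadedError("ArchGDAL"))

# Stand-in for what an extension would add once the backend is loaded.
struct DemoCRS
    code::Int
end
_crs2transform(source::DemoCRS, target::DemoCRS) = (source.code, target.code)

# The specialised method is used when its types match...
transform = _crs2transform(DemoCRS(3857), DemoCRS(4326))

# ...and everything else falls through to the helpful error.
fallback_error = try
    _crs2transform("EPSG:3857", "EPSG:4326")
catch err
    err
end
```

With this shape, a code path that never needs a transform (such as a raster already in lon-lat) would never reach the fallback at all.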

@rafaqz
Owner

rafaqz commented Sep 27, 2024

But how do we deal with Proj/Wkt, a lot of different text strings can be wgs84

@tiemvanderdeure
Contributor Author

> But how do we deal with Proj/Wkt, a lot of different text strings can be wgs84

The way we already do it - I don't think this line of code needs ArchGDAL

if convert(CoordSys, crs(dims)) == CoordSys("Earth Projection 1, 104") # check if need to reproject

@rafaqz
Owner

rafaqz commented Sep 27, 2024

Yeah that's ArchGDAL doing that conversion ;)

@tiemvanderdeure
Contributor Author

tiemvanderdeure commented Sep 27, 2024

> Yeah that's ArchGDAL doing that conversion ;)

I was thrown off by the GeoFormatTypes wrapper, but makes sense. Let's just leave it as it is, then.

@rafaqz
Owner

rafaqz commented Sep 27, 2024

Yes ArchGDAL actually pirates Base.convert on GeoFormat, so the confusion is understandable. We could put it in a GeoFormatTypesArchGDAL extension now, but that wasn't possible at the time. We could also switch to using Proj.jl directly.

@alex-s-gardner
Contributor

Can we take this opportunity to rename cellsize to cellarea and to return area in units of meters #747, also adding a units kwarg?

@alex-s-gardner
Contributor

alex-s-gardner commented Sep 27, 2024

There is room for significant speed gains for small grid spacing:

# For lat/lon aligned rasters with small grid spacing, this will be much faster and just as accurate:

# load packages
import Rasters: EPSG
using Rasters
using Rasters.Lookups
using DimensionalData


# define meters to lat/lon function
"""
    meters2lonlat_distance(distance_meters, latitude_degrees)

Returns the decimal degree distance along latitude and longitude lines given a distance in 
meters and a latitude in decimal degrees.

# Example usage:
```julia-repl
julia> distance_meters = 1000.0;  
julia> latitude_degrees = 45.0;  

julia> lat, lon = meters2lonlat_distance(distance_meters, latitude_degrees)
(0.008997741566866717, 0.012718328120254203)
```
"""
function meters2lonlat_distance(distance_meters, latitude_degrees)
    # Radius of the Earth in meters
    earth_radius = 6371000

    # Calculate the angular distance in radians
    angular_distance = distance_meters / earth_radius

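    # Longitude distance in degrees (small-angle limit of the haversine formula);
    # meridians converge toward the poles, hence the division by cos(latitude)
    longitude_distance = angular_distance * (180.0 / π) / cosd(latitude_degrees)

    # Latitude distance in degrees, assuming ~111,139 m per degree of latitude
    latitude_distance = distance_meters / 111139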

    return latitude_distance, longitude_distance
end

# build an example raster
dX = 0.1
dY = -0.1
lon = X(Projected(166.:dX:168.; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
# the latitude range is descending, so its order is ReverseOrdered()
lat = Y(Projected(-78.0:dY:-80.; sampling=Intervals(Start()), order=ReverseOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))

# given a lat/lon raster with small grid spacing, calculate area
dlon = dims(ras, :X)
dlat = dims(ras, :Y)
# each element is a (latitude_degrees, longitude_degrees) tuple for 1 m of distance
latlon_per_meter = meters2lonlat_distance.(Ref(1), dlat)

# cell edge lengths in meters; abs() keeps them positive when the step is negative
dist_lon = abs(step(dlon)) ./ getindex.(latlon_per_meter, 2)
dist_lat = abs(step(dlat)) ./ getindex.(latlon_per_meter, 1)
area = ones(dlon) * DimArray(dist_lon .* dist_lat, dlat)'

@rafaqz
Owner

rafaqz commented Sep 27, 2024

> Can we take this opportunity to rename cellsize to cellarea and to return area in units of meters #747, also adding a units kwarg?

Let's wait for a complete Unitful extension to add the units keyword to everything all at once. (in a few months when my PhD is done 😅)

I'm happy to rename but we should @deprecate the current name so it still works for a while.
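
As a rough illustration of what that deprecation could look like, here is a sketch using `Base.@deprecate`. The two-argument `cellarea` below is a hypothetical stand-in, not the Rasters method:

```julia
# Hypothetical stand-in for the renamed function; not the Rasters implementation.
cellarea(cell_width, cell_height) = cell_width * cell_height

# Keep the old name callable for a transition period. The trailing `false`
# stops `@deprecate` from exporting the old name.
Base.@deprecate cellsize(cell_width, cell_height) cellarea(cell_width, cell_height) false

old_result = cellsize(2.0, 3.0)   # forwards to `cellarea`
new_result = cellarea(2.0, 3.0)
```

Note that Julia only prints the deprecation warning when started with `--depwarn=yes` (or during package tests), so the old name keeps working quietly by default.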

@alex-s-gardner
Contributor

> I'm happy to rename but we should @deprecate the current name so it still works for a while.

given the incorrectness of cellsize it might actually be better to break it

@tiemvanderdeure
Contributor Author

> There is room for significant speed gains for small grid spacing:

Have you compared to the implementation in this PR? For me it is giving similar speeds.

I can see the point of your implementation, but using the haversine formula also has downsides, since it doesn't account for the curvature of the earth. I know this isn't a big deal in most cases, but the error increases with the size of gridcells. So after (dis)aggregating a raster, the total area returned by cellarea would be different, which is not ideal.

Maybe we should implement it for other projections than lon-lat, but then we might need to do some more geometry.
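
To make the aggregation point concrete, here is a small sketch comparing an exact spherical lon-lat cell area with a planar width-times-height estimate. The radius, function names, and example cells are assumptions for illustration and are not necessarily what this PR implements:

```julia
# Sketch only: exact area of a lon-lat cell on a sphere versus a planar
# width-times-height estimate. The radius and function names are assumptions.
const SPHERE_RADIUS_M = 6_371_000.0

# Area between two meridians and two parallels: R² * Δλ * (sin φ₂ - sin φ₁).
function spherical_cell_area(lon_west, lon_east, lat_south, lat_north)
    dlon = deg2rad(lon_east - lon_west)
    return SPHERE_RADIUS_M^2 * dlon * (sind(lat_north) - sind(lat_south))
end

# Planar estimate: metres per degree evaluated at the cell centre.
function planar_cell_area(lon_west, lon_east, lat_south, lat_north)
    lat_mid = (lat_south + lat_north) / 2
    height = SPHERE_RADIUS_M * deg2rad(lat_north - lat_south)
    width = SPHERE_RADIUS_M * deg2rad(lon_east - lon_west) * cosd(lat_mid)
    return width * height
end

# One 2° x 2° cell and the four 1° x 1° cells it aggregates.
coarse = (10.0, 12.0, 60.0, 62.0)
fine = [(w, w + 1.0, s, s + 1.0) for w in (10.0, 11.0), s in (60.0, 61.0)]

exact_coarse = spherical_cell_area(coarse...)
exact_fine_sum = sum(c -> spherical_cell_area(c...), fine)
planar_coarse = planar_cell_area(coarse...)
planar_fine_sum = sum(c -> planar_cell_area(c...), fine)
```

The exact areas of the four fine cells add up to the coarse cell's area, while the planar estimates differ slightly between the two resolutions.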

@tiemvanderdeure
Contributor Author

> > Can we take this opportunity to rename cellsize to cellarea and to return area in units of meters #747, also adding a units kwarg?
>
> Let's wait for a complete Unitful extension to add the units keyword to everything all at once. (in a few months when my PhD is done 😅)

Maybe we should already start returning the result in metres though, to avoid another breaking change down the line. If we add a units keyword later then it would default to m^2.

@rafaqz
Owner

rafaqz commented Sep 28, 2024

Yeah result in metres is fine, just not the units kw

@alex-s-gardner
Contributor

> Have you compared to the implementation in this PR? For me it is giving similar speeds.

Your implementation knocks the socks off of my proposal (2x faster) and is more accurate so ignore my recommendation. Excited to see this progressing... fast, useful and intuitive... couldn't ask for anything more

@rafaqz
Owner

rafaqz commented Oct 1, 2024

Is this good to go?

@tiemvanderdeure
Contributor Author

I think so! Can you or @asinghvi17 just nod at the logic in the line of code that reprojects to mappedcrs? I don't think we need an option to turn that behaviour off, right? I guess users vouch for the accuracy of that transformation by providing a mappedcrs.

@asinghvi17
Contributor

Feel free to ignore the cosd/sind stuff if there's no time, I just discovered that they use an extended-precision deg2rad that might be useful.
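
A quick way to see the difference, using only functions from Julia's Base:

```julia
# Degree-based trig functions return exact values at special angles where the
# radian versions pick up rounding error from the degree-to-radian conversion.
exact_cos = cosd(90)               # exactly 0.0
rounded_cos = cos(deg2rad(90))     # tiny non-zero value
exact_sin = sind(180)              # exactly 0.0
rounded_sin = sin(deg2rad(180))    # tiny non-zero value
```

For cell areas built from differences of sines of latitudes, that kind of rounding can matter near the poles and the equator.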

Resolved review threads:
ext/RastersArchGDALExt/cellarea.jl (7 threads, 4 outdated)
test/cellarea.jl (1 thread)
src/extensions.jl (1 thread)
@tiemvanderdeure
Contributor Author

Thanks for reviewing @asinghvi17! I didn't know about cosd and sind so that was really helpful.

I think this is really solid now! The only thing I could think of that we could still add is an option to disable the transformation and just return the cartesian area for planar projections. Should I add it?

@alex-s-gardner
Contributor

> I think this is really solid now! The only thing I could think of that we could still add is an option to disable the transformation and just return the cartesian area for planar projections. Should I add it?

I think this would be super useful. If users understand their projection and just want the map-projected area in map units, they could simply pass a kwarg like area_in_crs or similar.

@tiemvanderdeure
Contributor Author

> The only thing I could think of that we could still add is an option to disable the transformation and just return the cartesian area for planar projections.

I implemented this now and it works (even without ArchGDAL). I don't know if I like area_in_crs so much, though. terra uses transform, but that's also not super clear, is it?

@asinghvi17
Contributor

asinghvi17 commented Oct 2, 2024

Ideally we would say cellarea(Linear(), raster) or cellarea(Spherical(), raster), I guess. That will have to wait for a bunch of refactors in GeometryOps so that Rasters can depend on a "GeometryOpsCore" that exports these types.

Then if one wants as much precision as possible it's even possible to do cellarea(Geodesic(some_datum), raster)
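
A minimal sketch of what dispatching on such singleton types might look like. The types, the `cell_area` function, and the bounds tuple are defined here purely for illustration and are not the GeometryOps or Rasters API:

```julia
# Illustrative only: these types are defined here for the sketch and are not
# the GeometryOps or Rasters API.
abstract type AreaMethod end
struct Linear <: AreaMethod end       # plain cartesian area in CRS units
struct Spherical <: AreaMethod end    # area on a sphere, in square metres

const DEMO_RADIUS_M = 6_371_000.0     # assumed mean Earth radius

# A cell described by its bounds: (xmin, xmax, ymin, ymax).
const CellBounds = NTuple{4,Float64}

cell_area(::Linear, (xmin, xmax, ymin, ymax)::CellBounds) = (xmax - xmin) * (ymax - ymin)

# Interprets the bounds as lon/lat degrees on a sphere.
function cell_area(::Spherical, (lon1, lon2, lat1, lat2)::CellBounds)
    return DEMO_RADIUS_M^2 * deg2rad(lon2 - lon1) * (sind(lat2) - sind(lat1))
end

bounds = (0.0, 1.0, 0.0, 1.0)
linear_area = cell_area(Linear(), bounds)        # square degrees
spherical_area = cell_area(Spherical(), bounds)  # square metres
```

The appeal of this design is that a new method such as a geodesic one can be added as another type without touching keyword arguments.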

@alex-s-gardner
Contributor

> Ideally we would say cellarea(Linear(), raster) or cellarea(Spherical(), raster)

It seems like now would be the time to do this as this is a breaking change. We could make it easy and just remove kwarg in favor of Linear() and Spherical() types that could be internally defined... then we can expand in future to be more flexible and to eventually integrate GeometryOps... but at a later date?

@asinghvi17
Contributor

asinghvi17 commented Oct 2, 2024

It would take about 3 days to get out, but we could technically define a GeometryOpsCore.jl package today and get it registered, Rasters could then depend on that (it should have way less dependencies / load time).

I wouldn't want to define them internally because someone using this version of Rasters and a future version of GeometryOps will get quite a bit of incompatibility.

@rafaqz
Owner

rafaqz commented Oct 2, 2024

We have also been talking about singletons like that (in person here, sorry to be exclusive!).

But it makes sense to share them with GeometryOps.jl

_area_from_coords(transform, geom) = _area_from_coords(transform, GI.trait(geom), geom)
function _area_from_coords(transform::AG.CoordTransform, ::GI.LinearRingTrait, ring)
    points = map(GI.getpoint(ring)) do p
        t = AG.transform!(AG.createpoint(p...), transform)
Owner

createpoint is super slow: it's allocating C objects that need a finalizer and GC for every single point. Can we run the transformation on larger geometries and skip this? Like just GI.convert(LinearRing, ring) instead?

Comment on lines 48 to 53
function _area_from_coords(transform::AG.CoordTransform, ::GI.LinearRingTrait, ring)
    points = map(GI.getpoint(ring)) do p
        t = AG.transform!(AG.createpoint(p...), transform)
        (GI.x(t), GI.y(t))
    end
    return _spherical_quadrilateral_area(GI.LinearRing(points))
Owner

@rafaqz rafaqz Oct 4, 2024

Suggested change
- function _area_from_coords(transform::AG.CoordTransform, ::GI.LinearRingTrait, ring)
-     points = map(GI.getpoint(ring)) do p
-         t = AG.transform!(AG.createpoint(p...), transform)
-         (GI.x(t), GI.y(t))
-     end
-     return _spherical_quadrilateral_area(GI.LinearRing(points))
+ function _area_from_coords(transform::AG.CoordTransform, trait::GI.AbstractCurveTrait, ring)
+     t = AG.transform!(GI.convert(AG.geointerface_geomtype(trait), ring), transform)
+     return _spherical_quadrilateral_area(GI.convert(GI.geointerface_geomtype(trait), t))

Something like this should be better as GI skips around ArchGDAL point creation and just works on the internal vectors of points in the LinearRing

Owner

@rafaqz rafaqz Oct 4, 2024

Guessing this will be 3x faster

It also means far fewer calls to transform! as well as fewer allocations

(We could also just add a Proj dep and skip around GDAL completely, Proj is super fast just on points...)

Comment on lines 82 to 87
GI.LinearRing([
    (xb[1], yb[1]),
    (xb[2], yb[1]),
    (xb[2], yb[2]),
    (xb[1], yb[2]),
    (xb[1], yb[1])
Owner

This might give some more micro optimisation, 4 things is usually better than 5 things with computers...

Suggested change
- GI.LinearRing([
-     (xb[1], yb[1]),
-     (xb[2], yb[1]),
-     (xb[2], yb[2]),
-     (xb[1], yb[2]),
-     (xb[1], yb[1])
+ GI.LineSting([
+     (xb[1], yb[1]),
+     (xb[2], yb[1]),
+     (xb[2], yb[2]),
+     (xb[1], yb[2]),

Contributor

Suggested change
- GI.LinearRing([
-     (xb[1], yb[1]),
-     (xb[2], yb[1]),
-     (xb[2], yb[2]),
-     (xb[1], yb[2]),
-     (xb[1], yb[1])
+ GI.LineString([
+     (xb[1], yb[1]),
+     (xb[2], yb[1]),
+     (xb[2], yb[2]),
+     (xb[1], yb[2]),

typo


function _spherical_quadrilateral_area(ring)
    ps = GI.getpoint(ring)
    (p1, p2, p3, p4) = _lonlat_to_sphericalpoint.((ps[1], ps[2], ps[3], ps[4]))
Owner

The last point isn't actually used here so my LineString approach below does make sense
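
For readers following along, here is a rough sketch of how a spherical quadrilateral area can be computed from just four corners via the sum of interior angles (Girard's theorem). It assumes a convex quadrilateral with great-circle edges and corners given as (lon, lat) in degrees. All function names are hypothetical and this is not the code under review:

```julia
using LinearAlgebra: cross, dot, norm

# Sketch under assumptions: unit sphere by default, convex quadrilateral whose
# edges are great-circle arcs, corners given as (lon, lat) in degrees.
function lonlat_to_unit_vector((lon, lat))
    return [cosd(lat) * cosd(lon), cosd(lat) * sind(lon), sind(lat)]
end

# Interior angle at vertex `b` between the arcs b→a and b→c.
function interior_angle(a, b, c)
    to_a = a - dot(a, b) * b      # tangent at b pointing toward a
    to_c = c - dot(c, b) * b      # tangent at b pointing toward c
    return atan(norm(cross(to_a, to_c)), dot(to_a, to_c))
end

# Girard's theorem for a quadrilateral: area = (sum of angles - 2π) * R².
function spherical_quad_area(corners; radius=1.0)
    p = map(lonlat_to_unit_vector, corners)
    angles = (interior_angle(p[4], p[1], p[2]), interior_angle(p[1], p[2], p[3]),
              interior_angle(p[2], p[3], p[4]), interior_angle(p[3], p[4], p[1]))
    return (sum(angles) - 2π) * radius^2
end

# Four points on the equator bound a hemisphere, area 2π on the unit sphere.
hemisphere = spherical_quad_area([(0.0, 0.0), (90.0, 0.0), (180.0, 0.0), (270.0, 0.0)])

# A 1° x 2° cell straddling the equator, given without a repeated closing point.
cell = spherical_quad_area([(0.0, -1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 1.0)])
```

Only four distinct corners enter the calculation, which is why a closing fifth point adds nothing here.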


codecov bot commented Oct 4, 2024

Codecov Report

Attention: Patch coverage is 95.58824% with 3 lines in your changes missing coverage. Please review.

Project coverage is 82.61%. Comparing base (a15ebb1) to head (ae793e6).
Report is 56 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ext/RastersArchGDALExt/cellarea.jl | 96.29% | 2 Missing ⚠️ |
| src/extensions.jl | 92.85% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #768      +/-   ##
==========================================
+ Coverage   82.32%   82.61%   +0.28%     
==========================================
  Files          60       62       +2     
  Lines        4357     4566     +209     
==========================================
+ Hits         3587     3772     +185     
- Misses        770      794      +24     


@asinghvi17
Copy link
Contributor

asinghvi17 commented Oct 5, 2024

Happy to report that this solves #764 in web mercator as well!
